JCO Precision Oncology — Latest Matching Preprints

1

Impact of surveillance colonoscopy on colorectal cancer incidence and mortality in Lynch syndrome - a national observational cohort study of patients in the English NHS 2010-2022

Huntley, C.; Loong, L.; Mallinson, C.; Rahman, T.; Torr, B.; Allen, S.; Allen, I.; Hassan, H.; Fru, Y. W. J.; Tataru, D.; Paley, L.; Vernon, S.; Houlston, R.; Muller, D.; Lalloo, F.; Shaw, A.; Burn, J.; Morris, E.; Tischkowitz, M.; Antoniou, A. C.; Pharoah, P. D. P.; Monahan, K.; Hardy, S.; Turnbull, C.

2026-04-22 oncology 10.64898/2026.04.16.26351020 medRxiv

Top 0.1%

2.3%

Background: Lynch syndrome (LS) is a cancer susceptibility syndrome caused by germline pathogenic variants in DNA mismatch repair (MMR) genes. Due to increased risk of colorectal cancer (CRC), enhanced colonoscopic surveillance is recommended for heterozygote MMR-carriers. Objective: Using a registry of English LS patients linked to digital National Health Service records, we aimed to assess adherence of MMR-carriers to national surveillance guidelines, and to determine the impact of surveillance on CRC incidence and mortality. Design: We described the frequency of colonoscopies in 4,732 MMR-carriers and used logistic regression to determine predictors of surveillance adherence. For MMR-carriers with a record of surveillance and those without, we: estimated age-specific annual CRC incidence rates (AS-AIRs) and cumulative lifetime risks, assessed for stage-shift by comparing CRC stage distributions and stage-specific AS-AIRs, and estimated risks of death from CRC and any cause using Kaplan-Meier methods and Cox Proportional Hazards regression. Results: Surveillance at a mean interval of ≤3 years (n=3028) was associated with a decrease in CRC-specific and all-cause mortality, without an associated change in total CRC incidence, even after multivariate adjustment. No strong evidence of stage-shift was observed. Colonoscopic surveillance at a mean interval of ≤2 years (n=1569) was associated with an increase in total CRC incidence. Incidence of early-stage cancers was also higher, with no corresponding decrease in late-stage cancers, which may reflect the short follow-up period or the impact of overdiagnosis. Conclusion: The observed reduction in all-cause mortality amongst regularly-surveilled MMR-carriers may indicate an impact of surveillance on CRC-specific mortality, though in the context of a non-randomised study likely reflects the influence of selection bias.
KEY MESSAGES OF ARTICLE. What is already known on this topic: Regular surveillance colonoscopy is recommended in Lynch syndrome, though evidence to support this remains mixed. We searched PubMed for articles published from inception to 01/05/2024 using the terms "Lynch syndrome", "HNPCC", "colonoscopy", "sigmoidoscopy", "surveillance", and "screening". We found one controlled trial and several small analytical studies dating from the early 2000s which compared surveilled and non-surveilled populations and found surveillance to be associated with reduced colorectal cancer (CRC) incidence and improved survival. More recent longitudinal observational studies, most without comparator groups, found a high incidence of CRC in LS populations despite being resident in countries where surveillance was recommended. A small number of studies directly assessed time since last colonoscopy against CRC incidence and stage with mixed findings. Finally, cross-sectional comparisons between countries of CRC incidence rates and surveillance interval recommendations found no relationship between the two [1,2]. What this study adds: Here, we conduct an observational cohort study on a large national cohort of MMR germline pathogenic variant (GPV) carriers (MMR-carriers) in England (n=4,732), comparing CRC incidence and mortality in individuals with a record of regular surveillance to those without. Through linkage of the English National Lynch Syndrome Registry to Hospital Episodes Statistics data, we are uniquely able to study a comprehensive national population of MMR-carriers and identify the dates on which colonoscopies were undertaken over time, allowing assessment of adherence to national surveillance guidelines and the impact this has on CRC outcomes. Notably, receipt of regular colonoscopy was strongly associated with deprivation as well as ethnicity.
The results show that regular surveillance at an average interval of 3 years (or less) is not associated with a reduction in CRC incidence when compared to less frequent surveillance, but an apparent decrease in both CRC-specific and overall mortality is observed, even after adjustment for confounding variables. Conversely, regular surveillance at an average interval of 2 years (or less) is associated with an increase in CRC incidence when compared to less frequent surveillance, which may suggest increased diagnosis of early-stage cancers or, due to the absence of a reduction in late-stage cancers, overdiagnosis. The observed impact of surveillance on overall mortality may demonstrate the impact of surveillance on CRC-specific mortality, or, in the context of an observational (non-randomised) study, indicate that the results are subject to selection bias. How this study might affect research, practice, or policy: Evidence for the benefit of surveillance colonoscopy remains mixed. Whilst polypectomy would be anticipated to prevent CRC development (thus reducing CRC incidence), several studies have observed increased frequency of CRCs in MMR-carriers undergoing frequent surveillance colonoscopy, which may reflect overdiagnosis. The selection bias inherent to observational studies of surveillance renders mortality outcomes challenging to interpret. Randomised controlled trials of colonoscopic surveillance in MMR-carriers are required for effectiveness of this intervention to be accurately assessed. Given ethical and feasibility challenges, randomised controlled trials might be complemented by quasi-experimental designs using advanced observational methods for assessing effectiveness.

2

Phase 1a Evaluation of LP-184 in Recurrent Glioblastoma: Safety, Pharmacokinetics, and Translational Optimization of CNS Exposure

Schreck, K.; Lal, B.; Zhou, J.; Lopez Bertoni, H.; Holdhoff, M.; Ewesudo, R.; Bhatia, K.; Chamberlain, M.; Laterra, J.

2026-04-24 oncology 10.64898/2026.04.21.26351406 medRxiv

Top 0.1%

2.2%

Purpose: Limited CNS bioavailability and pharmacodynamics are obstacles to effective systemic therapies for glioblastoma. One strategy to overcome these challenges is drug combinations enhancing CNS penetration and/or tumor chemosensitivity. LP-184, a synthetic acylfulvene class alkylator, induces DNA damage and inhibits glioblastoma cell viability in pre-clinical models. LP-184 is a prodrug converted to active metabolites by intracellular prostaglandin reductase 1 (PTGR1) that is over-expressed in >70% of glioblastoma. DNA damage induced by LP-184 is MGMT agnostic and reversed by transcription-dependent NER. Patients: LP-184 was evaluated in a Phase 1a study (NCT05933265) in 63 adult patients with advanced malignancies including 16 patients with recurrent glioblastoma. All patients with glioblastoma received prior standard-of-care therapy and most had received 1 or more additional therapies before enrollment. Results: Patients with glioblastoma experienced more frequent transaminitis, Grade 1-2 nausea and a trend towards more frequent and severe thrombocytopenia compared to the non-glioblastoma cohort. Otherwise, overall toxicity profiles were similar. Clinical pharmacokinetic analysis combined with published pre-clinical intra-tumoral bioavailability data (~20% penetration) predicted that LP-184 at the recommended dose for expansion (RDE) would achieve cytotoxic levels if combined with spironolactone, a BBB permeable ERCC3 degrader and TC-NER inhibitor that sensitizes glioblastoma cells to LP-184 3-6-fold. We show that three daily doses of spironolactone deplete orthotopic glioblastoma PDX ERCC3 protein by ~80% and increase tumor LP-184 cytotoxicity 2-fold. Conclusions: LP-184 is well tolerated at the RDE, and we establish a clinically translatable scheme for dosing spironolactone in combination with LP-184 for a future Phase 1b clinical trial.

3

Histology-Derived Signatures Predict Recurrence Risk and Chemotherapy Benefit in Randomized Trials of Early Breast Cancer

Howard, F. M.; Li, A.; Kochanny, S.; Sullivan, M.; Flores, E. M.; Dolezal, J.; Khramtsova, G.; Hassan, S.; Medenwald, R.; Saha, P.; Fan, C.; McCart, L.; Watson, M.; Teras, L. R.; Bodelon, C.; Patel, A. V.; Symmans, W. F.; Partridge, A.; Carey, L.; Olopade, O. I.; Stover, D.; Perou, C.; Yao, K.; Pearson, A. T.; Huo, D.

2026-04-24 oncology 10.64898/2026.04.23.26351499 medRxiv

Top 0.2%

1.9%

Purpose: To test whether histology-derived gene-expression signatures from routine hematoxylin and eosin slides are prognostic for recurrence and predictive of chemotherapy benefit in early breast cancer. Methods: We conducted a multi-cohort study including CALGB 9344 (anthracycline +/- paclitaxel), CALGB 9741 (standard vs dose-dense chemotherapy), a pooled Chicago real-world cohort, and the American Cancer Society (ACS) Cancer Prevention Studies-II and -3. Whole-slide images were processed with a previously described pipeline to generate 61 histology-derived signatures per patient. The primary endpoint was distant recurrence-free interval (DRFI), except in ACS, where breast cancer-specific survival was used. Secondary endpoints include distant recurrence-free survival (DRFS) and overall survival. The most prognostic signature in CALGB 9344, selected by Harrell's C-index, was evaluated in additional cohorts. Signature-treatment interaction was assessed by likelihood-ratio tests. Multivariable Cox models incorporating age, tumor size, nodal status, estrogen/progesterone receptor status, and signature were fit in CALGB 9344 to improve risk stratification. Results: A total of 7,170 patients were included across four cohorts. The top histology-derived signature in CALGB 9344 showed strong prognostic performance for 5-year DRFI (C-index 0.63) and performed well across validation cohorts (C-index 0.60, 0.70, and 0.62 in CALGB 9741, Chicago, and ACS, respectively). The strongest predictive signal for treatment benefit was observed for DRFS. High-risk cases identified by the signature demonstrated greater benefit from taxane in CALGB 9344 (adjusted hazard ratio [aHR] 0.76 for DRFS, 95% CI 0.66-0.88; interaction p=0.028), from dose-dense chemotherapy in CALGB 9741 (aHR 0.69, 95% CI 0.56-0.85; interaction p=0.039), and differential chemotherapy benefit in the Chicago cohort (aHR 0.84, 95% CI 0.59-1.21; interaction p=0.009). 
Combined clinical-histology models improved risk stratification and identified low-risk groups with a 2%-10% risk of distant recurrence or breast cancer death. Conclusion: Histology-derived signatures from H&E images are broadly prognostic and, unlike clinical factors, may predict chemotherapy benefit.

4

Practical Management of Adverse Events Associated with Bispecific Antibodies for the Treatment of Multiple Myeloma: A Qualitative Interview Study

Graham, T. R.; White, M. G.; Blue, B.; Hartley-Brown, M.; Hunter, B. D.; Huynh, C.; Joseph, N.; Keruakous, A.; Pan, D.; Rudolph, P.; Sawhney, R.; Suvannasankha, A.

2026-04-27 oncology 10.64898/2026.04.24.26350878 medRxiv

Top 0.2%

1.4%

PURPOSE: Bispecific antibodies (BsAbs) represent a major advancement in the management of relapsed/refractory multiple myeloma (RRMM), offering high response rates even in heavily pretreated patients. However, their use presents operational, safety, and supportive care complexities that require coordinated care teams and evolving infrastructure. This manuscript summarizes best practice recommendations for adverse event (AE) management, outpatient operational models, referral pathways, and emerging strategies to optimize long-term tolerability. METHODS: Medlive, A PlatformQ Health Brand, conducted qualitative interviews of academic and community-based clinicians. Discussions focused on BsAb implementation, patient selection and counseling, and AE management. Experts provided recommendations on team-based protocols, transitions of care, and inpatient versus outpatient considerations. RESULTS: Ten hematologists/oncologists (academic n=4; community n=6) described practice patterns, barriers, and perspectives on BsAb use. BsAbs were consistently regarded as highly effective across multiple lines of therapy, particularly for patients without alternatives. Cytokine release syndrome (CRS) was the most common acute toxicity, generally low grade and managed effectively with early tocilizumab, including prophylactic use in outpatient settings. Immune effector cell-associated neurotoxicity syndrome (ICANS) was rare, mild, and best mitigated through early recognition and caregiver support. Infections, largely from BCMA-associated hypogammaglobulinemia, frequently interrupted therapy, necessitating antiviral prophylaxis, Pneumocystis jirovecii pneumonia (PJP) prophylaxis, and intravenous immunoglobulin (IVIG). Outpatient step-up dosing is expanding, supported by prophylactic strategies and academic-community collaboration. Timely referral was emphasized to preserve eligibility.
Major outpatient challenges included sequencing, infrastructure readiness, and standardized caregiver and staff education. CONCLUSION: Effective community implementation of BsAbs requires multidisciplinary coordination, standardized AE protocols, infection prevention, and infrastructure to support monitoring, referrals, and equitable access. These measures are critical to ensure safe, sustainable integration of bispecific therapies and to optimize patient outcomes.

5

Semaglutide is associated with improved breast cancer survival, lower metastatic burden, and a dose-survival relationship uncoupled from weight-loss magnitude

Murugadoss, K.; Venkatakrishnan, A. J.; Soundararajan, V.

2026-04-24 oncology 10.64898/2026.04.23.26351609 medRxiv

Top 0.3%

0.9%

Metabolic dysfunction is increasingly recognized as a risk factor for poor outcomes in breast cancer, but whether incretin-based therapies confer survival benefit beyond weight loss remains unresolved. Using a federated electronic health record platform spanning nearly 29 million patients, we evaluated breast cancer survival after semaglutide and tirzepatide initiation in routine care. In 1:1 propensity-matched pooled-comparator analyses, semaglutide was associated with improved overall survival versus metformin, sodium-glucose cotransporter 2 (SGLT2) inhibitor, and dipeptidyl peptidase 4 (DPP4) inhibitor users, with 54 deaths among 2,433 semaglutide users (2.2%) versus 395 deaths among 2,433 comparators (16.2%) over 24 months (log-rank P < 0.001). Tirzepatide showed a favorable survival association relative to pooled anti-diabetic comparators that did not meet statistical significance (P = 0.24), with 3 deaths among 220 users (1.4%) versus 64 deaths among 220 comparators (29.1%). In a head-to-head propensity-score-matched comparison, overall survival did not differ significantly between semaglutide- and tirzepatide-treated patients with pre-existing breast cancer (2,117 per arm; P = 0.12). In semaglutide-treated patients alive and observable at the 1-year landmark, higher maximum dose achieved was significantly associated with lower post-landmark mortality (P = 0.034), with an event rate of approximately 1.0% in the high-dose group (>=1.7 mg) versus approximately 4.5% in the low-dose group (0.25-1.0 mg). However, despite a linear relationship between dose and weight loss for semaglutide, weight-loss strata did not separate survival outcomes (global P = 0.22). In tirzepatide-treated patients alive and observable at the same landmark, neither maximum dose achieved nor weight-loss strata separated post-landmark survival (P = 0.98 and P = 0.50, respectively).
Structured EHR and AI-based clinical note analyses further showed significantly lower frequency of documented metastatic disease in semaglutide-treated patients relative to pooled anti-diabetic comparators, including any metastasis (7.0% versus 15.0%, rate ratio 0.5, P < 0.001), bone metastasis (1.0% versus 5.2%, rate ratio 0.2, P < 0.001), and liver, lung, or brain metastases (all P < 0.001). LLM-derived cause-of-death extraction further showed a 60% lower relative proportion of cancer-associated deaths in semaglutide-treated patients (19% of ascertainable deaths) than in matched pooled anti-diabetic comparators (47% of ascertainable deaths), with comparator deaths more often attributed to cancer progression involving metastatic breast cancer, leptomeningeal carcinomatosis, and cancer-driven organ failure. Overall, this study demonstrates that semaglutide use in patients with pre-existing breast cancer is associated with a dose-correlated but weight-loss-independent improvement in overall survival. These findings motivate prospective trials of GLP-1 receptor agonists in breast cancer across various stages and treatment settings.
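
The headline percentages and ratios in this abstract follow from simple arithmetic on the reported counts. The sketch below recomputes them as a reading aid; the figures are copied from the abstract, while the helper name `pct` and the variable names are illustrative and not part of the study.

```python
# Recompute the summary figures quoted in the semaglutide abstract.
# All inputs are copied from the abstract; names are hypothetical.

def pct(events, total):
    """Return events/total as a percentage rounded to one decimal place."""
    return round(100 * events / total, 1)

# 24-month deaths in the 1:1 propensity-matched cohorts
sema_mortality = pct(54, 2433)             # semaglutide users
comparator_mortality = pct(395, 2433)      # pooled anti-diabetic comparators
tirz_mortality = pct(3, 220)               # tirzepatide users
tirz_comparator_mortality = pct(64, 220)   # matched comparators

# Rate ratios for documented metastasis (semaglutide % / comparator %)
any_met_rr = round(7.0 / 15.0, 1)
bone_met_rr = round(1.0 / 5.2, 1)

# Relative reduction in the share of cancer-associated deaths (19% vs 47%)
relative_reduction = round(100 * (1 - 19 / 47))

print(sema_mortality, comparator_mortality)
print(tirz_mortality, tirz_comparator_mortality)
print(any_met_rr, bone_met_rr, relative_reduction)
```

These are crude proportions only; they do not reproduce the propensity matching, the log-rank tests, or the landmark analyses described in the abstract.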

6

Onca: An Open 9B Language Model for Pancreatic Cancer Clinical Tasks

Shim, K. B.

2026-04-24 oncology 10.64898/2026.04.16.26351055 medRxiv

Top 0.3%

0.9%

Pancreatic ductal adenocarcinoma (PDAC) remains one of the deadliest solid tumors and continues to face low treatment-trial participation, fragmented evidence workflows, and labor-intensive abstraction of unstructured clinical text. Existing oncology-focused language models show promise, but many depend on private institutional corpora, limiting reproducibility and practical reuse across centers. We present Onca, an open 9B dense model designed for four PDAC-relevant tasks: trial eligibility screening, case-specific clinical reasoning, structured pathology report extraction, and molecular variant evidence reasoning. Onca is fine-tuned from Qwopus3.5-9B-v3 with a single Unsloth BF16 LoRA adapter on 37,364 training rows drawn from openly available sources. The evaluation spans 11 panels and compares Onca against Woollie-7B, CancerLLM-7B, OpenBioLLM-8B, and the unmodified Qwopus base. Onca achieves the strongest overall results on Trial Screening (81.6 F1), Clinical Reasoning (14.1 composite), Pathology Extraction (30.5 field exact-match), PubMedQA Cancer (68.3 macro-F1), and PubMedQA (66.5 macro-F1). The strongest gains appear in tasks closest to routine oncology workflow, especially trial review and pathology structuring. These findings suggest that clinically targeted pancreatic-cancer language models can be built from open data with competitive performance while remaining practical to train on a single workstation-scale GPU setup.

7

Methodological and Clinical Validation of TholdStormDX v0.0.1: An Advanced Stochastic Engine for the Optimization of Thresholds and Multimarker Panels Applied to Oncology

Reinosa, R.

2026-04-27 oncology 10.64898/2026.04.24.26351692 medRxiv

Top 0.3%

0.9%

Introduction: The translation of biomarkers into binary clinical decisions requires the determination of precise cut-off points. This study validates the TholdStormDX v0.0.1 tool, a mathematical engine that employs Dual Annealing, 2- and 4-parameter logistic fitting, and vectorized Monte Carlo simulations for panel optimization under Boolean OR logic. Methods: The tool was evaluated using datasets from four diagnostic domains (Pulmonary Nodules, Hepatocellular Carcinoma [HCC], Cervical Cancer, and Breast Cancer), along with a prognosis-oriented analytical context (Breast Cancer). Validation followed a strict workflow: characterization and selection of the best individual and combined thresholds in the Training (Train) and Validation (Val) sets, using the Test set in a completely independent manner, solely to assess the model's performance and generalizability. Results: The tool enabled precise derivation of cut-off points for both individual biomarkers and multivariable combinations. Evaluation on the Test set objectively demonstrated in which scenarios a single biomarker outperforms a complex panel, promoting clinical parsimony. For example, in Breast Cancer diagnosis, an individual predictor outperformed the optimized panel (Sensitivity: 0.953 / Specificity: 0.952 in Test); conversely, in Hepatocellular Carcinoma, the multivariable combination showed superior performance compared to the single marker (Sens: 0.707 / Spe: 0.718 in Test). Additionally, the self-auditing system effectively flagged metric degradation when noisy variables were included, preventing potential issues. Conclusion: TholdStormDX v0.0.1 proves to be a robust and transparent bioinformatics platform for deriving clinical thresholds. Its main contribution lies in mitigating local minima and promoting clinical parsimony, enabling researchers to objectively identify when a single biomarker is sufficient and when a panel provides real added value.
Furthermore, it transforms the problem of biological noise into a safety feature: by systematically warning about algorithmic instability, it prevents overfitting and ensures the clinical viability of medical decisions. Availability: The software is free and distributed under the GNU GPLv3 license. TholdStormDX v0.0.1 is written in Python, and its source code is available at the following GitHub address: https://github.com/roberto117343/TholdStormDX.

8

CT-Based Deep Foundation Model for Predicting Immune Checkpoint Inhibitor-Induced Pneumonitis Risk in Lung Cancer

Muneer, A.; Showkatian, E.; Kitsel, Y.; Saad, M. B.; Sujit, S. J.; Soto, F.; Shroff, G. S.; Faiz, S. A.; Ghanbar, M. I.; Ismail, S. M.; Vokes, N. I.; Cascone, T.; Le, X.; Zhang, J.; Byers, L. A.; Jaffray, D.; Chang, J. Y.; Liao, Z.; Naing, A.; Gibbons, D. L.; Vaporciyan, A. A.; Heymach, J. V.; Suresh, K. S.; Altan, M.; Sheshadri, A.; Wu, J.

2026-04-23 oncology 10.64898/2026.04.21.26351428 medRxiv

Top 0.3%

0.8%

Background: Immune checkpoint inhibitors (ICIs) have revolutionized cancer therapy but can cause serious immune-related adverse events (irAEs), with pneumonitis (ICI-P) being among the most severe. Early identification of high-risk patients before ICI initiation is critical for closer monitoring, timely intervention, and improved outcomes. Purpose: To develop and validate a deep learning foundation model to predict ICI-P from baseline CT scans in patients with lung cancer. Methods: We designed the Checkpoint-Inhibitor Pneumonitis Hazard EstimatoR (CIPHER), a deep learning foundation model that combines contrastive learning with a transformer-based masked autoencoder to predict ICI-P from baseline CT scans in patients with lung cancer. Using self-supervised learning, CIPHER was pre-trained on 590,284 CT slices from 2,500 non-small cell lung cancer (NSCLC) patients to capture heterogeneous lung parenchymal patterns. After pre-training, the model was fine-tuned on an internal NSCLC cohort for ICI-P risk prediction, using images from 254 patients for model development and 93 patients for internal validation. We compared CIPHER with classical radiomic models and further evaluated it on an external NSCLC cohort of 116 patients. Results: In the internal immunotherapy cohort, CIPHER consistently distinguished patients at elevated risk of ICI-P from those without the event, with AUCs ranging from 0.77 to 0.85. In head-to-head benchmarking, CIPHER achieved an AUC of 0.83, outperforming the radiomic models. In the external validation cohort, CIPHER maintained strong performance (AUC = 0.83; balanced accuracy = 81.7%), exceeding the radiomic models (DeLong p = 0.0318) and demonstrating higher specificity without sacrificing sensitivity. By contrast, the radiomic model showed high sensitivity (85.0%) but markedly lower specificity (45.8%). 
Confusion matrix analysis confirmed the robust classification performance of CIPHER, correctly identifying 80 of 96 non-ICI-P cases and 16 of 20 ICI-P cases. Conclusions: We developed and externally validated CIPHER for predicting future risk of ICI-P from pre-treatment CT scans. With prospective validation, CIPHER may be incorporated into routine patient management to improve outcomes.
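
The external-validation figures can be cross-checked from the confusion-matrix counts quoted above. The following is a minimal sketch that assumes the standard definitions of sensitivity, specificity, and balanced accuracy; the function and variable names are illustrative and do not come from the CIPHER code.

```python
# Derive sensitivity, specificity, and balanced accuracy from the
# confusion-matrix counts reported for the external cohort (n = 116).
# Function and variable names are hypothetical.

def balanced_accuracy(tp, fn, tn, fp):
    """Return (sensitivity, specificity, balanced accuracy) as fractions."""
    sensitivity = tp / (tp + fn)   # share of ICI-P cases correctly flagged
    specificity = tn / (tn + fp)   # share of non-ICI-P cases correctly cleared
    return sensitivity, specificity, (sensitivity + specificity) / 2

# 16 of 20 ICI-P cases and 80 of 96 non-ICI-P cases were classified correctly
sens, spec, bal_acc = balanced_accuracy(tp=16, fn=20 - 16, tn=80, fp=96 - 80)

print(f"sensitivity = {sens:.1%}")
print(f"specificity = {spec:.1%}")
print(f"balanced accuracy = {bal_acc:.1%}")
```

Under these definitions the counts give a balanced accuracy that rounds to the 81.7% stated in the abstract.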

9

Mechanistic learning to predict and understand minimal residual disease

Marzban, S.; Robertson-Tessi, M.; West, J.

2026-04-21 cancer biology 10.64898/2026.04.16.718968 medRxiv

Top 0.5%

0.6%

Mechanistic modeling has long been used as a tool to describe the dynamics of biological systems, especially cancer in response to treatment. Its key advantage lies in interpretability of relationships between input parameters and outcomes of interest. In contrast, machine learning techniques offer strong prediction performance, especially for high dimensional datasets that are common in oncology. Here, we employ a Mechanistic Learning framework that combines the advantages of both approaches by training machine learning models on mechanistic parameters inferred from clinical patient data. The mechanistic model (a Markov chain model) contains sixteen parameters that describe the rate of cell fate transitions that occur in patients with B-cell precursor acute lymphoblastic leukemia. The machine learning model (a ridge logistic regression) is trained on these parameters to predict two clinically-relevant features: BCR::ABL1 fusion gene status (positive or negative) and minimal residual disease status (positive or negative) post-induction chemotherapy. Model training is done in an iterative fashion to assess which (and how many) parameters are critical to maintain high predictive performance. Using machine learning models trained on the clinical flow-cytometry data, we find that the stem-like cell state alone is the most predictive feature for both BCR::ABL1-positive and MRD-positive disease, with combination scores (defined as the average of accuracy, balanced accuracy, and area under the curve) of 0.80 and 0.67, respectively. By comparison, mechanistic learning achieves comparable or improved combination scores for BCR::ABL1-positive and MRD-positive disease, with scores of 0.81 and 0.71, respectively, using only de-differentiation for BCR::ABL1 and primitive-state persistence together with differentiation-directed exit for MRD.
Thus, the mechanistic-learning approach not only preserves predictive performance, but also provides a biological hypothesis for why stemness is predictive of these clinically relevant outcomes.

10

Tumor Biology and Patterns of Recurrence in High-Grade Glioma: Implications for Radiation Target Delineation

Barve, R.; Gowda, D.; Illiayaraja, K. J.

2026-04-25 oncology 10.64898/2026.04.23.26351633 medRxiv

Top 0.5%

0.6%

Purpose: Recurrence in high-grade glioma (HGG) predominantly occurs within the high-dose radiation field, raising the question of whether treatment failure reflects limitations in radiation target delineation or is driven by intrinsic tumor biology. This study evaluated recurrence patterns following standard chemoradiotherapy and their treatment implications. Material and Methods: This retrospective single-center study included 41 patients with histologically confirmed HGG treated with surgery followed by radiotherapy with concurrent and adjuvant temozolomide (TMZ). Patients were followed through August 2018; those with recurrence were included in the analysis. Recurrence patterns were classified based on their spatial relationship to the 60 Gy isodose line as central, infield, marginal, or distant. Survival outcomes were estimated using the Kaplan-Meier method and compared using the log-rank test. Results: The most common pattern of recurrence was central (15 patients, 36.5%), followed by infield (11, 26.8%), distant (6, 14.6%), marginal (5, 12.1%), and multicentric (4, 9.8%). Central and infield recurrences (local failures) accounted for 26 patients (63%). Median overall survival (OS) was 27 months, and median progression-free survival (PFS) was 12 months. Survival differed significantly by recurrence pattern (log-rank p = 0.018), with marginal recurrence associated with more favorable outcomes. Conclusion: The predominance of central and infield recurrences within the high-dose region suggests that treatment failure in HGG is not solely explained by inadequate target delineation and may also be driven, in part, by intrinsic tumor biology, including radioresistant subpopulations and tumor heterogeneity. Future strategies may benefit from incorporating biologically guided approaches alongside optimization of radiation treatment parameters.

11

Estimation of cancer cases in transgender and gender diverse people in England

Pasin, C.; Jackson, S. S.; Thynne, L.-E.; McWade, B.; Westerman, T.; Ball, R.; Kavanagh, J.; O'Callaghan, S.; Ring, K.; Orkin, C.; Berner, A. M.

2026-04-22 oncology 10.64898/2026.04.21.26351378 medRxiv

Top 0.6%

0.4%

Objectives: To estimate current, and 5- and 10-year projected, number of cases of cancer per year in transgender and gender diverse (TGD) people in England, overall and by tumour type, accounting for uptake of gender affirming care (GAC). Design: Population-based epidemiological modelling study using an age-stratified Monte Carlo simulation approach and the NORDPRED method for predictions. Setting: Models estimating cancer case numbers for TGD people in England based on publicly available 2023 cancer surveillance data and survey-based 2025 GAC access, and predicted at 5 and 10 years hence. Participants: TGD people aged 15 years and above. Main outcome measures: Primary cancer cases per year overall, by gender, age group, tumour type, and current and planned GAC. Results: The estimated TGD population size in England is 441547 (95% uncertainty interval (UI) 429207-452890). Total cases per year of cancer in TGD people is expected to be 966 (95% UI 882-1069) excluding non-melanoma skin. Most cases are expected to occur in people aged 60-64. The top 5 expected cancers in TGD people are breast (19%, n = 187, 95% UI 149-241), colorectal (12%, n = 117, 95% UI 106-129), lung (11%, n = 108, 95% UI 96-122), melanoma (7.1%, n = 69, 95% UI 64-74) and urinary (6.2%, n = 60, 95% UI 54-67). Total cases of cancer in TGD people are estimated to be 1740 (95% UI 1584-1934) in 5 years and 2258 (95% UI 2066-2507) in 10 years (excluding non-melanoma skin). If TGD people were able to access their planned level of GAC, this would reduce these figures to 1555 (95% CI 1386-1766) and 2012 (95% CI 1797-2282) respectively. Conclusions: This study provides prediction of cancer cases in TGD people in England, supporting the planning of service provision and training. This is vital, as with increasing disclosure, and long wait times for GAC, cancer cases in TGD people are predicted to increase.
Summary Boxes. What is already known on this topic: The annual number of cases of cancer in transgender and gender diverse (TGD) people in England is currently unknown as gender incongruence is not collected as part of the National Cancer Registration and Analysis Service. Some gender-affirming care (GAC) interventions are known to modulate cancer risk. Use of testosterone and chest reconstruction for transmasculine people is known to reduce their incidence of breast cancer compared to cisgender women. Use of oestradiol alongside medical or surgical androgen suppression has been shown to reduce the incidence of prostate cancer in transfeminine people while increasing their risk of breast cancer, compared to cisgender men. What this study adds: This study found that there are likely to be approximately 966 cases of cancer (excluding non-melanoma skin) in TGD people per year in the UK. Though total annual cases of cancer in TGD people are expected to be 2258 in 10 years, improved access to gender-affirming care could reduce total cases to 2012 (an 11% reduction). These figures provide additional justification for funding to improve access to GAC via the National Health Service (NHS), as well as for training on the oncological needs of this population.
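
The quoted reduction of roughly 11% is a relative difference between the projected case counts with and without planned GAC access. As a rough check using only the point estimates from the abstract (the function and variable names are illustrative, and the uncertainty intervals are ignored):

```python
# Relative reduction in projected annual cancer cases if planned
# gender-affirming care (GAC) were accessible. Point estimates are taken
# from the abstract; names are hypothetical.

def relative_reduction_pct(baseline, with_gac):
    """Percentage reduction from baseline to with_gac, one decimal place."""
    return round(100 * (baseline - with_gac) / baseline, 1)

five_year = relative_reduction_pct(1740, 1555)   # 5-year projection
ten_year = relative_reduction_pct(2258, 2012)    # 10-year projection

print(f"5-year reduction: {five_year}%")
print(f"10-year reduction: {ten_year}%")
```

The 10-year value rounds to the 11% quoted in the summary box.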

12

Comparing Gleason Pattern 4 Measurement Approaches on Prostate Biopsy Using Machine Learning: A Proof-of-Principle Study

Buzoianu, M. M.; Yu, R.; Assel, M.; Bozkurt, A.; Aghdam, H.; Fine, S.; Vickers, A.

2026-04-24 oncology 10.64898/2026.04.23.26351615 medRxiv

Top 0.6%

0.4%

Objective: To demonstrate proof of principle that machine learning (ML) can be used to quantify Gleason Pattern (GP) 4 on digitized biopsy slides using multiple measurement approaches, allowing direct comparison of their prognostic performance. Methods: We assembled a convenience sample of 726 patients with grade group 2-4 prostate cancer on systematic biopsy who underwent radical prostatectomy between 2014 and 2023. Digitized biopsy slides were analyzed using a machine-learning algorithm (PAIGE-AI) to quantify GP4 using multiple measurement approaches, particularly with respect to how gaps between cancer foci (interfocal stroma) were handled. GP4 extent was quantified using linear measurements or a pixel-based area metric. Discrimination of each GP4 quantification approach, along with Grade Group (GG), was assessed for adverse radical prostatectomy pathology and biochemical recurrence (BCR). Results: We identified 15 different quantification approaches and observed differences in their discrimination. The highest discrimination was achieved by the pixel-counting method (AUC 0.648). GP4 quantification outperformed GG for predicting adverse pathology (AUC 0.627 vs 0.608). Amount of GP3 was non-predictive once GP4 was known. These findings were consistent for BCR. Conclusions: We were able to measure slides using 15 distinct measurement approaches and replicated prior findings using ML to quantify GP4. Our findings support the use of ML as a research tool to compare different GP4 quantification approaches. We intend to use our method on larger cohorts to determine which measurement approach best predicts oncologic outcome.
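
The abstract compares measurement approaches by their discrimination (AUC). As a minimal sketch of what that metric measures, the code below computes AUC as the probability that a randomly chosen case with the outcome has a higher measurement than a randomly chosen case without it. The data values and the names `auc`, `gp4_extent`, and `adverse` are hypothetical and are not taken from the study.

```python
from itertools import product


def auc(scores, labels):
    """Probability that a random positive case outscores a random negative one.

    Ties between a positive and a negative case count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical GP4 extent measurements and adverse-pathology outcomes
# (1 = adverse pathology at prostatectomy, 0 = not).
gp4_extent = [0.1, 0.4, 0.35, 0.8]
adverse = [0, 0, 1, 1]
print(auc(gp4_extent, adverse))
```

Running the same function on each quantification approach's measurements against a shared outcome vector would give directly comparable AUC values.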

13

Chinese Herbal Medicine as a complementary therapy for the management of Colorectal Cancer: Study protocol for a Delphi Expert Consensus survey

Ng, C. Y.; Liu, M.; Ai, D.; Yao, L.; Yang, M.; Zhong, L. L.

2026-04-22 oncology 10.64898/2026.04.21.26350990 medRxiv

Top 0.6%

0.4%

Introduction: Colorectal cancer (CRC) remains a leading cause of cancer-related morbidity and mortality worldwide, despite advances in conventional oncological therapies. In recent years, various studies have made advances in integrative oncology, such as investigating the use of Chinese Herbal Medicine (CHM) as a complementary therapy alongside conventional oncological therapies to alleviate treatment-related adverse effects, improve quality of life, and potentially enhance therapeutic outcomes. Despite this, clinical practice in this area remains highly heterogeneous, with limited standardized guidelines on key areas of concern such as (1) optimal intervention, (2) recommended stage and duration of intervention, (3) safety considerations, and (4) possible herb-drug interactions. Hence, this study aims to establish expert consensus on the usage of CHM as a complementary therapy in the management of CRC, to support safe, consistent, and evidence-informed clinical practice. Methods and Analysis: We will employ a modified Delphi technique to achieve consensus amongst a panel of international experts in various fields related to integrative oncology. Prior to the study, a list of questionnaire items was developed based on a systematic review of existing clinical practice guidelines on CRC. An international panel will be invited based on established international profile in integrative oncology research and clinical practice, and by peer referral. Two rounds of Delphi will be conducted using anonymous online questionnaires. Consensus will be considered reached if at least 50% of the panel strongly agree/disagree that an item should be included or excluded, while strong consensus will be set at 76%. Items which achieve strong consensus after Round 1 will be removed before the remaining items are sent out for Round 2 with a summary of Round 1 responses for a final consensus.
Ethics and Dissemination: Ethics approval has been obtained from the Institutional Review Board of Nanyang Technological University (IRB-2025-1222). Our findings will be disseminated through peer-reviewed publications and conference presentations. Strengths and limitations of this study: (i) This study will develop an expert consensus which aims to guide future integration of Chinese Herbal Medicine (CHM) as a complementary therapy into colorectal cancer (CRC) management. (ii) Key concerns that currently limit implementation in clinical practice have been identified in areas such as determining the (1) optimal intervention, (2) recommended stage and duration of intervention, (3) safety considerations, and (4) possible herb-drug interactions; addressing them lays the groundwork for potential future incorporation of CHM into CRC treatment protocols alongside conventional oncology approaches. (iii) Designing a study e-guide and conducting the consensus rounds online will facilitate participants' responses and the dissemination of information from previous rounds.
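
The consensus rule described in the protocol can be sketched as a small classifier. This is only an illustration under stated assumptions: both thresholds are treated as inclusive ("at least"), "strongly agree" is read as a vote to include and "strongly disagree" as a vote to exclude, and the function name `classify_item` is hypothetical rather than part of the protocol.

```python
def classify_item(strongly_agree: int, strongly_disagree: int, panel_size: int,
                  consensus: float = 0.50, strong: float = 0.76) -> str:
    """Classify one Delphi item from the counts of strong responses.

    Assumes inclusive thresholds: 50% for consensus, 76% for strong consensus.
    """
    if panel_size <= 0:
        raise ValueError("panel_size must be positive")
    if strongly_agree < 0 or strongly_disagree < 0:
        raise ValueError("counts must be non-negative")
    if strongly_agree + strongly_disagree > panel_size:
        raise ValueError("counts exceed panel size")

    # The larger of the two strong-response shares decides the direction.
    share = max(strongly_agree, strongly_disagree) / panel_size
    direction = "include" if strongly_agree >= strongly_disagree else "exclude"

    if share >= strong:
        return f"strong consensus to {direction}"
    if share >= consensus:
        return f"consensus to {direction}"
    return "no consensus"


# Hypothetical Round 1 tallies for a 20-member panel.
print(classify_item(16, 1, 20))   # 80% strongly agree
print(classify_item(3, 11, 20))   # 55% strongly disagree
print(classify_item(8, 4, 20))    # neither share reaches 50%
```

Under this reading, only items in the "strong consensus" category would be removed after Round 1, with the rest carried into Round 2.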

14

A catalogue of missense and nonsense mutation abundances for the U.S. cancer patient population

Arun, A.; Liarakos, D.; Mendiratta, G.; McFall, T.; Hargreaves, D. C.; Wahl, G. M.; Hu, J.; Stites, E. C.

2026-04-22 oncology 10.64898/2026.04.20.26351248 medRxiv

Top 0.6%

0.4%

Widespread genomic sequencing efforts have characterized the molecular foundations of the different cancers. By combining these genomic data in a manner proportional to the population-level abundances of these different cancers, we estimate the overall abundances of each observed missense and nonsense mutation within the U.S. cancer patient population. We find BRAF V600E (5.2%) is the most common mutation in the cancer patient population, TP53 R175H (1.5%) is the most common tumor suppressor mutation, and APC R876X (0.4%) is the most common nonsense mutation. These values differ substantially and significantly from what would be found in a typical pan-cancer analysis, where different cancer types are included out of proportion to population-level incidence. We present the full ordered lists of population-level abundances for specific missense and nonsense mutations, and we demonstrate the value of these data by further analyzing high priority genes (e.g., TP53, KRAS, BRAF) and pathways (e.g., RTK/RAS, PI3K, and WNT/β-catenin). Overall, this information is a resource that should benefit the basic science, translational, and clinical cancer research communities.
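
The core idea, weighting each cancer type's mutation frequency by that cancer's share of the patient population rather than by its share of sequenced samples, can be sketched as follows. All numbers and cancer type names are hypothetical, and `weighted_frequency` is an illustrative helper, not the authors' implementation.

```python
def weighted_frequency(weights: dict, mutation_freq: dict) -> float:
    """Average per-cancer mutation frequency using normalised weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(
        weights[cancer] / total * mutation_freq.get(cancer, 0.0)
        for cancer in weights
    )


# Hypothetical inputs: a common cancer where the mutation is rare and a rare
# cancer where it is frequent.
mutation_freq = {"common_cancer": 0.02, "rare_cancer": 0.40}
patients_per_year = {"common_cancer": 900, "rare_cancer": 100}  # population incidence
sequenced_samples = {"common_cancer": 500, "rare_cancer": 500}  # pan-cancer cohort

population_level = weighted_frequency(patients_per_year, mutation_freq)
pooled_pan_cancer = weighted_frequency(sequenced_samples, mutation_freq)
print(f"population-weighted: {population_level:.3f}")
print(f"pooled pan-cancer:   {pooled_pan_cancer:.3f}")
```

In this toy case the pooled estimate overstates the mutation's abundance because the rare cancer is over-represented among sequenced samples, which is the kind of distortion the abstract describes.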

15

Attention-Guided CNN Ensemble for Binary Classification of High-Grade and Low-Grade Serous Ovarian Carcinoma from Histopathological WSI Patches

Rani, A.; Mishra, S.

2026-04-22 oncology 10.64898/2026.04.21.26351441 medRxiv

Top 0.6%

0.4%

Accurate histopathological differentiation between High-Grade Serous Carcinoma (HGSC) and Low-Grade Serous Carcinoma (LGSC) remains a critical yet challenging aspect of ovarian cancer diagnosis due to their similar morphology and different clinical outcomes. This study presents a deep learning framework that uses custom attention mechanisms, including the Convolutional Block Attention Module (CBAM), Squeeze-and-Excitation (SE) blocks, and a Differential Attention module, within five CNN architectures for automated binary classification of ovarian cancer subtypes from H&E WSI patches. Although individual models achieved higher accuracy, the ensemble stacking framework with a shallow MLP meta-learner delivered the best overall performance, with a ROC-AUC of 0.9211, an accuracy of 0.85, and F1-scores of 0.84 and 0.85 for the two subtypes. These findings demonstrate that attention-guided feature recalibration combined with ensemble stacking provides robust and clinically interpretable discrimination of ovarian carcinoma subtypes.

16

A Context-Aware Target Engagement and Pharmacodynamic Biomarker Resource to Accelerate Drug Discovery and Development

Yang, Y.; Zhao, L.; Orouji, S.; Zhu, Y.; Johnson, R. L.; Maxwell, D. S.; Mica, I.; Russell, K. P.; Al-lazikani, B.

2026-04-22 bioinformatics 10.64898/2026.04.19.719411 medRxiv

Top 0.8%

0.2%

Confirming target engagement in tumor experimental models remains a major challenge in oncology drug development. Pharmacodynamic biomarkers can help address this, but few systematic resources link drug targets to candidate biomarkers. We developed TargetTrace, a comprehensive resource to identify and prioritize pharmacodynamic biomarkers across nine key target classes, including transcription factors/cofactors, kinases, phosphatases, ubiquitin ligases, deubiquitinases, acetyltransferases, deacetylases, methyltransferases, and demethylases. Biomarker candidates were gathered from curated molecular interaction resources and refined using external annotations to improve accuracy. For enzyme targets with measurable substrate changes, we applied a two-agent large language model workflow, followed by manual review, to harmonize antibody information from antibody resources and ensure that the selected biomarkers are measurable with existing laboratory tests. From more than 92,000 input interactions and over 2,300 targets, we compiled 71,323 target-biomarker relationships involving 2,270 potential drug targets, encompassing both transcription factor/cofactor-target gene and enzyme-substrate interactions. Commercial antibodies were available for over 1,400 biomarkers, supporting laboratory validation. TargetTrace provides a structured and reusable resource for systematic identification and prioritization of pharmacodynamic biomarkers in oncology.

17

Consensus Through Diversity: A Comprehensive Benchmark of Multi-Omic Approaches for Precision Breast Oncology

Sionakidis, A.; Pinilla Alba, K.; Abraham, J.; Simidjievski, N.

2026-04-21 bioinformatics 10.64898/2026.04.17.719159 medRxiv

Top 0.9%

0.2%

Emerging multi-omic profiling has made it feasible to subtype disease using multiple molecular layers. However, inconsistent preprocessing, heterogeneous implementations, variable evaluation, and limited reproducibility often constrain method selection. Here, we systematically benchmark 22 publicly available unsupervised approaches for bulk data on the TCGA-BRCA cohort across five modalities (RNA-seq, miRNA, DNA methylation, copy numbers, single nucleotide polymorphisms) and validate findings in two independent datasets, enabling a multi-layered comparison of performance, heterogeneous data support and interpretability. Most approaches fuse multi-omic data to produce a two-cluster solution largely aligned with ER status, with higher-resolution approaches further refining these into four coherent subclasses (angiogenic luminal, oxidative-phosphorylation/HER2-low luminal, immune-inflamed basal-like, and hyper-proliferative basal-like). Our benchmarking results indicate that methods based on similarity networks can efficiently produce stable, reliable partitions. Matrix factorisation and Bayesian factorisation algorithms produce rich latent representations, allowing quantification of feature and modality contributions, albeit at higher computational cost. Consensus clustering can be used on a case-by-case basis to refine partitions into more robust and generalisable findings. We aggregate our insights into a decision workflow that aligns with study goals, data characteristics, and computational resources, enabling optimal analytic strategies. This comprehensive assessment provides a practical roadmap for investigators seeking to extract reproducible, biologically meaningful subtypes from complex multi-omic datasets. We highlight the different technical and practical benefits and trade-offs that shape the selection and development of multi-omic approaches applied in precision oncology.

18

A novel hyperactive BCR::ABL1e6a3 variant confers resistance to combined asciminib plus ponatinib therapy

Nardi, V.; Schwieterman, J.; Ansari, S.; Kincaid, Z.; Azhar, M.; Yousuf, T.; Amir, N.; Khan, A.; Kesarwani, M.; Ryall, S.; Brunner, A. M.; Capilla Guerra, M. R.; Griffin, G. K.; Nassar, N.; Daley, G. Q.; Azam, M.

2026-04-24 oncology 10.64898/2026.04.14.26349982 medRxiv

Top 0.9%

0.2%

Despite considerable advances, the emergence of resistance to tyrosine kinase inhibitor (TKI) therapy remains a significant challenge in chronic myeloid leukemia (CML). Here, we report the first clinical case of resistance to combined ponatinib and asciminib therapy in a CML patient who relapsed with B lymphoblastic blast crisis. While at presentation the patient harbored the canonical e13a2 BCR::ABL1 fusion, at relapse his disease harbored the T315I mutation together with a novel e6a3 BCR::ABL1 fusion, which arose by internal deletion in the original translocated allele. Structural modeling and biochemical analyses demonstrated that deletion of exon 2-encoded residues of ABL1 destabilizes the autoinhibited conformation, resulting in a hyperactive kinase with increased propensity for B-cell differentiation. Functional studies revealed that both BCR::ABL1e6a3 and BCR::ABL1e6a3/T315I conferred resistance to ponatinib and asciminib, alone or in combination. BCR::ABL1e6a3 demonstrated enhanced sensitivity to the active-state selective inhibitors dasatinib and bosutinib, whereas BCR::ABL1e6a3/T315I remained resistant. Combined drug sensitivity assays showed that axitinib restored inhibitory activity when combined with ponatinib or asciminib. Strikingly, a combination of axitinib and asciminib with low-dose ponatinib fully suppressed the enzymatic activity of BCR::ABL1e6a3/T315I and cellular proliferation. These data show that treatment with asciminib and ponatinib can select for mutations with notably elevated enzymatic activity, which are effectively targeted by an axitinib-based triple combination. These data highlight the remarkable mutability of the BCR::ABL1 kinase, including through novel isoforms, and provide a strong rationale for the clinical assessment of a triple inhibitor combination as a strategy to overcome resistance to dual ponatinib and asciminib therapy.

19

A Cross-Cohort Validated Plasma Lipid Biomarker Assay for Early Breast Cancer Detection Using Machine Learning

Huang, T.; Koch, F. C.; Peake, D. A.; Adam, K.-P.; David, M.; Li, D.; Heffernan, K.; Lim, A.; Hurrell, J. G.; Preston, S.; Baterseh, A.; Vafaee, F.

2026-04-23 oncology 10.64898/2026.04.23.26351564 medRxiv

Top 0.9%

0.2%

Early detection of breast cancer remains essential for improving clinical outcomes, and complementary non-invasive approaches are needed to support existing screening methods, particularly for women with dense breast tissue. We have previously reported plasma lipid biomarker discovery using untargeted high-resolution liquid chromatography tandem mass spectrometry (LC-MS/MS). In this study, we performed biomarker confirmation and developed machine-learning models applied to targeted plasma lipid measurements for the non-invasive detection of early-stage breast cancer across international cohorts with independent external validation. Targeted LC-MS/MS was used to quantify candidate lipid panels in plasma samples from European discovery cohorts (n = 554) and an independent Australian cohort (n = 266) used for external validation. Data-driven feature selection identified a 15-lipid panel with strong performance in European cohorts (AUC >= 0.94). External validation prior to confidence stratification yielded 76% sensitivity, 64% specificity, and an AUC of 0.81 in the Australian validation cohort. Clinical assay development requires iterative panel and model testing to support translational feasibility and performance in the intended-use population. An analytically viable panel, excluding lipids requiring complex and costly synthesis, achieved comparable accuracy with improved assay robustness. Confidence-based analysis showed enhanced performance for predictions made with moderate to high confidence, with sensitivity up to 89% and AUC up to 0.85, suggesting that ongoing research should focus on strategies to enhance diagnostic model confidence. Importantly, model predictions were independent of breast density, tumour size, grade, subtype, and morphology, indicating biological specificity of the lipid signature. These results demonstrate that calibrated machine-learning models applied to plasma lipid biomarkers can support non-invasive breast cancer detection. 
Expanding training datasets to include greater diversity will further improve performance in the ongoing development of this lipid-based detection approach.

20

Metabolomic Profiling of Dried Blood Spots for Breast Cancer Detection: A Multi-Classifier Validation Study in 2,734 Participants

Anctil, N.; Hauguel, P.; Noel, L.-P.

2026-04-27 oncology 10.64898/2026.04.24.26351695 medRxiv

Top 1.0%

0.1%

Background. Breast cancer (BC) remains the most commonly diagnosed malignancy and the leading cause of cancer-related mortality in women worldwide. Although blood-based untargeted metabolomics has emerged as a promising modality for detecting early-stage BC, the clinical translation of this approach has been bottlenecked by two unresolved issues: (i) the field has almost exclusively relied on serum or plasma, which require venipuncture and cold-chain logistics, and (ii) machine-learning models reported on such data are frequently validated with protocols that are blind to analytical batch structure, producing optimistically biased performance estimates. Methods. We present a breast cancer detection study based on dried blood spots (DBS), an analytical matrix that enables self-collection and ambient-temperature shipping. A cohort of 2,734 participants (114 biopsy-confirmed BC cases; 2,620 non-cancer controls) was profiled by untargeted LC-MS/MS on a Thermo Scientific Orbitrap IQ-X coupled to a Vanquish UHPLC. A 39-metabolite panel meeting MSI Level 1 identification criteria was pre-specified from the published breast-cancer metabolomics literature, frozen prior to LC-MS acquisition, and applied to the present cohort without any feature selection on the data. Six standard supervised-learning architectures (LASSO, Elastic Net, Linear SVM, PLS-DA, OPLS-DA, XGBoost) were evaluated on this pre-specified panel; OPLS-DA is reported only in the sex-matched subgroup analysis where a single-seed 5-fold stratified protocol permits a directly comparable fit. Per-batch control-median normalization is applied upstream; kNN imputation, log transform, and robust scaling are fit within each training fold.
The evaluation battery comprises batch-aware StratifiedGroupKFold CV at single-seed (seed=42) with inter-seed SD quantified across 10 independent seeds, batch-aware nested CV, a 100-seed held-out 20%-batch validation with disjoint-batch isotonic probability calibration (30% calibration partition), PPV/NPV reporting at multiple operating points and three deployment prevalences, subgroup analyses by TNM stage and tumor grade, pathway-ablation sensitivity analysis, and a 1,000-iteration permutation test. Results. Under batch-aware evaluation (StratifiedGroupKFold, single-seed=42), AUC ranged from 0.914 to 0.949 across classifiers, with LASSO achieving 0.928 and XGBoost 0.949; inter-seed SD across 10 seeds was 0.002-0.006. At 95% specificity, LASSO reached 75.4% sensitivity and XGBoost 81.6%. Held-out batch validation (100 seeds) yielded mean AUC 0.912 for Elastic Net and 0.935 for XGBoost, confirming robust generalization. All 39 panel features showed high coefficient stability, and permutation testing on representative classifiers (LASSO, Linear SVM, PLS-DA) yielded p <= 0.001. Subgroup analyses showed weaker detection of stage IIA tumors (AUC 0.87, n=40) compared with stage IIB/IIIA (AUC 0.95), consistent with stronger metabolic signatures in more advanced disease. Bootstrap coefficient consistency of the Elastic Net classifier confirmed that all 39 panel features received a non-zero multivariate weight in >=80% of 100 stratified bootstraps. Conclusions. On this cohort of diagnosed, pre-treatment breast-cancer cases, DBS LC-MS metabolomic profiling delivers classification performance (AUC 0.928 for LASSO and 0.949 for XGBoost under batch-aware GroupKFold CV at single-seed=42; held-out AUC 0.912-0.935) that is robust across classifier families and biological pathways. The DBS matrix is non-radiating, self-collectable by finger-prick, and mailable at ambient temperature. 
Performance is weaker on stage IIA than on more advanced disease, and prospective validation in an independent asymptomatic screening cohort is required before clinical positioning as a decentralized triage modality.
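
The abstract mentions PPV/NPV reporting at several deployment prevalences. As a minimal sketch of that calculation, the code below applies Bayes' rule to a sensitivity/specificity operating point. The operating points are those quoted in the abstract (95% specificity; 75.4% sensitivity for LASSO, 81.6% for XGBoost); the cohort prevalence is derived from the stated case and participant counts, while the other two prevalences are hypothetical placeholders, since the abstract does not list the values the authors used.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    tp = sensitivity * prevalence              # true positives per person tested
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    tn = specificity * (1 - prevalence)        # true negatives
    fn = (1 - sensitivity) * prevalence        # false negatives
    ppv = tp / (tp + fp) if tp + fp > 0 else float("nan")
    npv = tn / (tn + fn) if tn + fn > 0 else float("nan")
    return ppv, npv


operating_points = {"LASSO": 0.754, "XGBoost": 0.816}  # sensitivity at 95% specificity
prevalences = {
    "study cohort (114/2734)": 114 / 2734,
    "hypothetical 1%": 0.01,
    "hypothetical 0.5%": 0.005,
}

for model, sens in operating_points.items():
    for label, prev in prevalences.items():
        ppv, npv = predictive_values(sens, 0.95, prev)
        print(f"{model:8s} {label:26s} PPV={ppv:.3f} NPV={npv:.4f}")
```

PPV falls as prevalence falls at a fixed operating point, which is one reason performance in a case-control cohort does not transfer directly to an asymptomatic screening population.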